Abstract: The purpose of this paper is to study metrics suitable 
for assessing uncertainty of power spectra when these are based 
on finite second-order statistics. The family of power spectra 
which is consistent with a given range of values for the estimated 
statistics represents the uncertainty set about the "true" power 
spectrum. Our aim is to quantify the size of this uncertainty set 
using suitable notions of distance, and in particular, to compute 
the diameter of the set since this represents an upper bound 
on the distance between any choice of a nominal element in the 
set and the "true" power spectrum. Since the uncertainty set 
may contain power spectra with lines and discontinuities, it is 
natural to quantify distances in the weak topology, that is, the topology 
defined by continuity of moments. We provide examples of 
such weakly-continuous metrics and focus on particular metrics 
for which we can explicitly quantify spectral uncertainty. We 
then consider certain high resolution techniques which utilize 
filter-banks for pre-processing, and compute worst-case a priori 
uncertainty bounds solely on the basis of the filter dynamics. 
This allows the a priori tuning of the filter-banks for improved 
resolution over selected frequency bands. 

Index Terms — Robust spectral estimation, uncertainty set, 
spectral distances, geometry of spectral measures, THREE filter 
design. 



I. Introduction 

In practice, the estimation of power spectra in stationary 
time-series often relies on second-order statistics. The 
premise is that these are moments of an underlying power 
spectral distribution — the true power spectrum. Thus, the 
question arises as to how much is "knowable" about the 
distribution of power in the spectrum from such statistics. 

Asymptotically, as more data accrue the convergence is 
guaranteed in a suitable sense, but the practical question 
remains on how to bound the error when only limited infor- 
mation is available. To this end, it is important to consider 
how a finite set of statistics localizes the power spectrum. 
Traditionally, for many applications, one relies on a particular 
power spectrum selected out of a variety of methods that 
lead to specific choices, all consistent (in different ways) with 
the recorded data and the estimated moments. Historically, 
Burg's algorithm and the maximum entropy spectrum, and the 
Pisarenko harmonic decomposition are specific such choices 
[26], [48], and so are the correlogram and the periodogram. 
Thus, in general, there exists a large family of admissible 
power spectra which are all consistent. Bounding the values of 
admissible spectral density functions is an ill-posed problem 
(see Section IV). Instead, the natural way to quantify power 
spectral uncertainty is by bounding the power on (measurable) 
subsets of the frequency band. Therefore, the goal of this paper 
is to consider the appropriate topology, the so-called weak 
topology, and to develop suitable metrics that can be used 
to quantify and measure power spectral uncertainty. 

Throughout, we consider stochastic processes $\{y_t : t \in \mathbb{Z}\}$ 
which are discrete-time, zero-mean, and second-order stationary. 
A typical set of statistics for a stationary stochastic process 
is a finite set of covariance (or, autocorrelation) samples. The 
covariance samples 

$$c_k := \mathcal{E}\{y_t \bar{y}_{t-k}\}, \quad \text{for } k = 0, \pm 1, \pm 2, \ldots, \pm n,$$

where $\mathcal{E}\{\cdot\}$ denotes the expectation operator, provide moment 
constraints for the power spectrum $d\mu$ of the process: 

$$c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ik\theta}\, d\mu(\theta) \quad \text{for } k = 0, \pm 1, \pm 2, \ldots, \pm n. \tag{1}$$



The power spectrum is thought of as a non-negative measure 
on the unit circle $\mathbb{T} = \{z = e^{i\theta} : \theta \in (-\pi, \pi]\}$ (for 
notational simplicity also identified with the interval $(-\pi, \pi]$). 
We use the symbol $\mathfrak{M}$ to denote the class of such measures, 
and the problem of determining $d\mu \in \mathfrak{M}$ from the covariance 
samples (finitely or infinitely many) is known as the trigonometric 
moment problem. Classical theory on this problem 
originates in the work of Toeplitz and Carathéodory at the 
turn of the 20th century and has evolved into a rather deep 
chapter of functional analysis and of operator theory [1], [35], 
[22], [12], [4]. The classical monograph by Geronimus [22] 
contains a wide range of results on the trigonometric moment 
problem, the asymptotic behavior of solutions, spectral factors 
and optimal predictors, as well as explicit expressions for 
spectral envelopes [22, Theorem 5.7] (cf. [?], [26], [14]). 
A more general form in which statistics may be available 
is when these represent the state covariance, or the output 
covariance, of a dynamical system driven by the stochastic 
process of interest. Such a dynamical system may represent 
a model of physical processing (bandpass filtering at sensor 
locations, losses, structure of sensor array, etc.) or of virtual 
processing (software-based) of the original time-series. Either 
way, covariance statistics represent (generalized) moments 
of the power spectrum and a theory which is completely 
analogous to the theory of the trigonometric moment problem 
is available and provides similar conclusions, see J6], Q, fl4l . 

02) , lfl6l . In fact, the use of generalized statistics, which 
relates to beamspace processing, was explored in Q, lfT31 
as a way to improve resolution in power spectral estimation 
over selected frequency bands. More recent work addresses 
spectral estimation with priors, computational issues, as well 
as important multivariate generalizations J3], Q, J9), iflOl . 

03) , 03. 0S1, Go), EQ, ED, io), E), m). 

The framework of the present work involves such moment 
problems specified by covariance statistics. Invariably, moment 
statistics are estimated from a finite observation record and are 
known with limited accuracy. Thus, in a typical experiment, as 
the observation record of a time-series increases so does the 
accuracy and the length of the estimated partial covariance 
sequence. Our goal is to develop metrics that can be used to 
quantify spectral uncertainty. More specifically, phrased in the 
context of the trigonometric moment problem, we seek metrics 
between power spectra that have the following properties: 

(i) given a finite set of covariance samples, the family of 
consistent power spectra has a finite diameter, and 

(ii) the diameter of the uncertainty set of power spectra shrinks 
to zero as the accuracy of the covariance samples 
increases and their number tends to infinity. 

The latter condition is dictated by the fact that the trigono- 
metric moment problem is known to be determined, i.e., 
there is a unique power spectrum which is consistent with 
an infinite sequence of covariances. As we will explain below 
(in Section III), the proper topology which allows for these 
properties to hold is the weak topology on measures (cf. 
[27, page 8]). There is a variety of metrics that can be used 
to metrize this topology, and thus, in principle, to quantify 
spectral uncertainty. A contribution of this work is to suggest 
a class of metrics for which the radius of spectral uncertainty 
and a priori bounds are computable given a finite set of (error- 
free) statistics. 

In Section II we review the trigonometric moment problem 
and relevant concepts in functional analysis. In Section III 
we define power-spectral uncertainty sets and discuss the 
relevance of weakly continuous metrics. In Section IV we 
present a collection of weakly continuous metrics that, in 
different ways, are suitable for metrizing the space of power 
spectra. In Section V we compute the diameter of uncertainty 
sets, for a particular choice of a metric, and elaborate on the 
limit properties of this uncertainty quantification. In Section 
VI we present an example that elucidates the relevance and 
applicability of the results in practice. In Section VII we 
explain how the framework applies in the context of generalized 
statistics. In Section VIII we highlight the use of 
this quantification of uncertainty in filter design: we show 
how to tune a filter-bank so as to limit spectral uncertainty over 
some frequency range of interest. In the concluding section 
(Section IX) we summarize the results and outline possible 
future directions. 

II. The trigonometric moment problem, spectral representations, and weak convergence 

The covariances $c_k$, $k = 0, \pm 1, \pm 2, \ldots$, of a stationary 
random process $\{y_t : t \in \mathbb{Z}\}$ are the Fourier coefficients 
of the spectral measure $d\mu$ as in (1). These are characterized 
by the non-negativity of the Toeplitz matrices 

$$T_n := \begin{bmatrix} c_0 & c_{-1} & \cdots & c_{-n} \\ c_1 & c_0 & \cdots & c_{-n+1} \\ \vdots & \vdots & \ddots & \vdots \\ c_n & c_{n-1} & \cdots & c_0 \end{bmatrix}$$

for $n = 0, 1, \ldots$. When $T_n > 0$ for $n \le k$ and $T_n$ is singular 
for $n = k + 1$, then it is also singular for all $n > k$ and 
$\mathrm{rank}(T_{k+\ell}) = \mathrm{rank}(T_k) = k + 1$ for all $\ell \ge 1$. In this case, $d\mu$ 
is singular with respect to the Lebesgue measure and consists 
of finitely many "spectral lines," equal in number to $\mathrm{rank}(T_n)$ 
[25, page 148]. Because $d\mu$ is a real measure, $c_{-k} = \bar{c}_k$ for 
$k = 0, 1, \ldots$, hence we use only positive indices and refer by 

$$c_{0:n} := (c_0, c_1, \ldots, c_n)$$

to the vector of the first $(n + 1)$ moments, and by 

$$c := (c_0, c_1, \ldots)$$

to the infinite sequence. The sequence $c$ is said to be positive 
if $T_n > 0$ for all $n$. Similarly, $c_{0:n}$ is said to be positive if 
$T_n > 0$. Accordingly, the term non-negative is used when the 
relevant Toeplitz matrices are non-negative definite. 
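
As an illustration of the positivity test just described, the following minimal sketch checks whether a finite sequence is positive by attempting a Cholesky factorization of $T_n$. It assumes real covariances (so that $T_n$ is symmetric); the helper names `toeplitz` and `is_positive` and the tolerance `tol` are hypothetical choices made for this example only.

```python
import math

def toeplitz(c):
    """Symmetric Toeplitz matrix T_n built from real covariances c = (c_0, ..., c_n)."""
    size = len(c)
    return [[c[abs(i - j)] for j in range(size)] for i in range(size)]

def is_positive(c, tol=1e-12):
    """Return True if T_n > 0, tested by attempting a Cholesky factorization."""
    T = toeplitz(c)
    size = len(T)
    L = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = T[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # pivot is not (numerically) positive
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

theta0 = 1.0
# A generic positive sequence.
print(is_positive([1.0, 0.5, 0.2]))
# Covariances of two equal spectral lines at +/- theta0: T_1 > 0 but T_2 is singular.
print(is_positive([1.0, math.cos(theta0)]))
print(is_positive([1.0, math.cos(theta0), math.cos(2 * theta0)]))
```

The last sequence corresponds to a spectrum with two lines, so $T_1$ is positive definite while $T_2$ has rank 2 and the test fails at the third pivot.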

As noted in the introduction, the power spectrum of a 
discrete-time stationary process is a bounded non-negative 
measure on the unit circle. The derivative (of its absolutely 
continuous part) is referred to as the spectral density function, 
while the singular part typically contains jumps (spectral 
lines) associated with the presence of sinusoidal components. 
In general, the singular part may have a more complicated 
mathematical structure that allocates "energy" on a set of 
measure zero without the need for distinct spectral lines [25, 
page 5]. From a mathematical viewpoint such spectra are 
important as they represent limits of more palatable spectra, 
and hence, represent a form of completion. 

The natural topology where such limits ought to be considered 
is the so-called weak topology. This topology is also 
known as the weak* topology in functional analysis, a term 
which is less frequently used in the context of measures. The 
weak topology is defined in terms of convergence of linear 
functionals and is explained next. We denote by $C(\mathbb{T})$ the class 
of real-valued continuous functions on $\mathbb{T}$. It is quite standard 
that the space of bounded linear functionals $\Lambda : C(\mathbb{T}) \to \mathbb{R}$ 
can be identified with the space of bounded measures on $\mathbb{T}$ 
[27, page 7]. More specifically, any bounded functional $\Lambda$ can 
be represented in the form 

$$\Lambda(f) = \int f(t)\, d\mu(t) \quad \text{for all } f \in C(\mathbb{T}),$$

with $d\mu$ being the corresponding measure; this is the Riesz 
representation theorem. Continuous functions now serve as 
"test functions" to differentiate between measures. Bounds 
on the corresponding integrals define the weak topology: a 
sequence of measures $d\mu_n$, $n = 1, 2, \ldots$, converges to $d\mu$ in 
the weak topology if $\int f\, d\mu_n \to \int f\, d\mu$ for every $f \in C(\mathbb{T})$. 
Thus, for any two measures that are different, there exists a 
continuous function that the two measures integrate to different 
values. In this setting, a measure can be specified uniquely by 
its Fourier coefficients. In fact, given a positive sequence $c$, 
the unique corresponding measure $d\mu$ can be determined as the 
limit in the weak topology of finite Fourier sums or Cesàro 
means [27, page 24]. 

Non-negative measures are naturally associated with ana- 
lytic and harmonic functions — a connection which has been 
exploited in classical circuit theory in the context of passivity. 
Herglotz' theorem [1] states that if $d\mu$ is a bounded non-negative 
measure on $\mathbb{T}$, then 

$$H[d\mu](z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\mu(\theta) \tag{2}$$

is analytic in $\mathbb{D} := \{z : |z| < 1\}$ and the real part is non-negative. 
Such functions are referred to as either "positive-real" 
or as Carathéodory functions. Conversely, any positive-real 
function can be represented (modulo an imaginary constant) 
by the above formula for a suitable non-negative measure. 
The Poisson integral of a non-negative measure $d\mu$, 

$$P[d\mu](z) := \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(t - \theta)\, d\mu(\theta), \quad z = re^{it}, \tag{3}$$

where $P_r(\theta) = \frac{1 - r^2}{|1 - re^{i\theta}|^2}$ is the Poisson kernel, is a harmonic 
function which is non-negative in $\mathbb{D}$ and is equal to the real part 
of $H[d\mu](z)$. Given either a positive-real function $H(z)$, or its 
real part $P(z)$, the measure $d\mu$ such that $H(z) = H[d\mu](z)$ 
and $P(z) = P[d\mu](z)$ is uniquely determined by the limit 
$P(re^{i\theta})\,d\theta \to d\mu$ as $r \to 1$ in the weak topology [27, page 
33]. Thus, power spectra are, in a very precise sense, boundary 
limits of the (harmonic) real parts of positive-real functions. 
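
The relation between (2) and (3) can be checked numerically for a purely discrete spectrum. The sketch below is illustrative only: it assumes the measure consists of finitely many spectral lines, represented by a hypothetical list `lines` of `(theta_j, mass_j)` pairs, and evaluates both integrals as finite sums.

```python
import cmath
import math

def herglotz(lines, z):
    """H[dmu](z) as in (2) for dmu = sum_j mass_j * delta(theta - theta_j)."""
    return sum(m * (cmath.exp(1j * th) + z) / (cmath.exp(1j * th) - z)
               for th, m in lines) / (2 * math.pi)

def poisson(lines, z):
    """P[dmu](z) as in (3), using the Poisson kernel P_r(t - theta) with z = r e^{it}."""
    r, t = abs(z), cmath.phase(z)
    return sum(m * (1 - r * r) / (1 - 2 * r * math.cos(t - th) + r * r)
               for th, m in lines) / (2 * math.pi)

lines = [(0.3, 2.0), (-1.2, 0.5)]   # two spectral lines (angle, mass)
z = 0.6 * cmath.exp(0.7j)           # a point inside the unit disc

print(herglotz(lines, z).real, poisson(lines, z))   # the two values agree
print(herglotz(lines, 0).real)                      # equals c_0 = total mass / (2*pi)
```

The real part of the positive-real function coincides with the Poisson integral at every point of the disc, and its value at the origin is the zeroth covariance $c_0$.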

III. Uncertainty of spectral estimates 

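We postulate a situation where covariances $c_{0:n}$ are estimated 
from a sample of a stochastic process $\{y_t\}_{t \in \mathbb{Z}}$ with power 
spectrum $d\nu$, and where the estimation errors in the entries of 
$c_{0:n}$ are bounded by $\epsilon$.$^1$ Thus, the "true" spectrum $d\nu$ belongs 
to the uncertainty set 

$$\mathcal{F}_{c_{0:n},\epsilon} := \left\{ d\mu \ge 0 : \left| c_k - \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ik\theta}\, d\mu \right| \le \epsilon,\ k = 0, 1, \ldots, n \right\}.$$

Likewise, any choice for a "nominal" spectrum consistent 
with our assumptions will also belong to $\mathcal{F}_{c_{0:n},\epsilon}$. Therefore, 
the distance between the two will be bounded by the diameter 
of the uncertainty set, 

$$\rho_\delta(\mathcal{F}_{c_{0:n},\epsilon}) := \sup\{\delta(d\mu_0, d\mu_1) : d\mu_0, d\mu_1 \in \mathcal{F}_{c_{0:n},\epsilon}\},$$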

where $\delta$ is a suitable metric at hand. Thus, our goal in this 
paper is to seek metrics $\delta$ on the space of positive measures 
$\mathfrak{M}$ that provide a meaningful and computationally tractable 
notion of a diameter for $\mathcal{F}_{c_{0:n},\epsilon}$, thereby quantifying modeling 
uncertainty in the spectral domain. To narrow down the search 
for suitable metrics, consider the scenario when the length 
of the data increases, and hence the accuracy as well as 
the number of covariance lags increases. In the limit, as the 
estimation error goes to zero and the number $n$ of covariance 
lags goes to infinity, the uncertainty set shrinks to the singleton 

$$\{d\nu\} = \bigcap_{n} \mathcal{F}_{c_{0:n},\epsilon_n} \quad (\epsilon_n \to 0 \text{ as } n \to \infty).$$

This is due to the fact that an infinite limit sequence $c$ 
defines a unique power spectrum: the trigonometric moment problem 
is determinate. The diameter should reflect this shrinkage to a 
singleton and tend to zero. For this to happen, the underlying 
metric needs to be weakly continuous as stated next. 
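Theorem 1: Let $\delta$ be a metric on $\mathfrak{M}$. Then 

$$\rho_\delta(\mathcal{F}_{c_{0:n},\epsilon_n}) \to 0 \quad \text{as } \epsilon_n \to 0 \text{ and } n \to \infty, \tag{4}$$

for every covariance sequence $c$ if and only if $\delta$ is weakly 
continuous. 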

1 The more realistic situation, where the confidence intervals degrade with 
the order of covariance lags, can be dealt with in a similar manner, albeit with 
a bit more cumbersome notation. 



Proof: This can be seen by comparing the definition of 
$\mathcal{F}_{c_{0:n},\epsilon_n}$ with the definition of open sets in the weak topology. 
See the appendix for a detailed proof. ■ 

Remark 2: Occasionally one may have additional a priori 
knowledge on the structure and smoothness of the power spec- 
trum which would further limit the uncertainty set. Quantifying 
such "structured" uncertainty would necessarily be problem- 
specific and is not considered in the present work. Instead, 
we take a viewpoint that allows comparing power spectra in a 
unified way, regardless of smoothness, presence of spectral lines, 
or membership in a specific class of models. □ 

We now consider the case where the finite covariance 
sample $c_{0:n}$ is known exactly. If $c_{0:n}$ is positive, then the 
uncertainty set 

$$\mathcal{F}_{c_{0:n}} := \left\{ d\mu \ge 0 : c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ik\theta}\, d\mu,\ k = 0, 1, \ldots, n \right\}$$

contains infinitely many power spectra. If $c_{0:n}$ is only non-negative, 
and hence $T_n$ is singular, then the family $\mathcal{F}_{c_{0:n}}$ 
consists of the single power spectrum $d\nu$ [25, page 148]. The 
following two results are immediate corollaries of Theorem 1. 
The first one treats the case where the number of covariance 
lags goes to infinity, while the second treats the case where 
the values of the covariance lags tend to those of a singular 
sequence. In both cases the diameter of the uncertainty set 
necessarily goes to zero for a weakly continuous metric. 

Corollary 3: Let $c$ be a non-negative sequence and let $\delta$ be 
a weakly continuous metric. Then 

$$\rho_\delta(\mathcal{F}_{c_{0:n}}) \to 0 \quad \text{as } n \to \infty.$$

Proof: This follows directly from Theorem 1 and by 
noting that $\mathcal{F}_{c_{0:n}} \subset \mathcal{F}_{c_{0:n},\epsilon}$ for any $\epsilon > 0$. It also follows from 
[31], [16] in view of Proposition 10 in Section IV-C below. ■ 

Corollary 4: Let $c_{0:n}$ be a vector of covariance lags such 
that the corresponding $T_n$ is a singular Toeplitz matrix, and let 
$c_{0:n}(k)$ ($k = 1, 2, \ldots$) be a sequence of vectors of covariance 
lags tending to $c_{0:n}$. If $\delta$ is a weakly continuous metric then 

$$\rho_\delta(\mathcal{F}_{c_{0:n}(k)}) \to 0 \quad \text{as } k \to \infty.$$

Proof: Follows directly from Theorem 1. See also [31] 
for an independent detailed argument. ■ 
Remark 5: It should be noted that the total variation ($\int |d\mu_0 - d\mu_1|$) 
is not weakly continuous and therefore the conclusions 
of the two corollaries would fail if this were used as the 
metric. To see this, note that if $c_{0:n}$ is positive, then $\mathcal{F}_{c_{0:n}}$ 
contains infinitely many measures and among them at least 
two singular measures with non-overlapping support, i.e., 
$\mathrm{supp}(d\mu_0) \cap \mathrm{supp}(d\mu_1) = \emptyset$ (e.g., see [35]). Then the total 
variation of their difference is always $2c_0$. □ 
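
The failure of weak continuity for the total variation can be seen on a toy case. The sketch below is an illustration under simplifying assumptions, not part of the original argument: it uses two unit-mass spectral lines at $\theta_0$ and $\theta_0 + h$ (so the total variation is 2 in these unnormalized units), the test function $g = \cos$, and hypothetical helper names `integrate` and `total_variation`.

```python
import math

def integrate(g, lines):
    """Integral of g against a discrete measure; lines = [(theta_j, mass_j), ...]."""
    return sum(m * g(th) for th, m in lines)

def total_variation(lines0, lines1):
    """Total variation of dmu0 - dmu1 for purely discrete measures."""
    diff = {}
    for th, m in lines0:
        diff[th] = diff.get(th, 0.0) + m
    for th, m in lines1:
        diff[th] = diff.get(th, 0.0) - m
    return sum(abs(v) for v in diff.values())

theta0 = 0.4
mu0 = [(theta0, 1.0)]
for h in (1e-1, 1e-3, 1e-6):
    mu1 = [(theta0 + h, 1.0)]          # the same line, shifted by h
    gap = abs(integrate(math.cos, mu0) - integrate(math.cos, mu1))
    print(h, gap, total_variation(mu0, mu1))
```

As $h \to 0$ the integrals against the continuous test function converge, so the shifted line converges weakly to the original one, while the total variation of the difference stays constant.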

IV. Weakly continuous metrics 

In general, a finite set of second-order statistics cannot 
dictate the precise value of the power spectrum locally. Indeed, 
given any finite positive sequence $c_{0:n}$ and any $\theta_0 \in (-\pi, \pi]$, 
for any value $\alpha > 0$ there exists an $\epsilon > 0$ and an 
absolutely continuous measure $d\mu = f\,d\theta \in \mathcal{F}_{c_{0:n}}$ such that 

$$f(\theta) = \alpha \quad \text{for } \theta \in (\theta_0 - \epsilon, \theta_0 + \epsilon).$$

What can be said instead is that the range of values 

$$\left\{ \int g\, d\mu : d\mu \in \mathcal{F}_{c_{0:n}} \right\} \tag{5}$$

for any particular test function $g \in C(\mathbb{T})$ is bounded. 
Furthermore, as $n \to \infty$, this range tends to zero. In fact, due 
to weak continuity, the range of values tends to zero for any of 
the scenarios in Theorem 1 and its two corollaries. Finding the 
maximum and the minimum of (5) is a linear programming 
problem on an infinite dimensional domain. Provided $g$ is 
symmetric and real and the covariance sequence $c_{0:n}$ is real, the 
dual problems, which give the lower and upper bounds of (5), are 

$$\max\left\{ \lambda c_{0:n}^T : \sum_{k=0}^{n} \lambda_k \cos(k\theta) \le g(\theta),\ \theta \in (-\pi, \pi] \right\}, \tag{6}$$

$$\min\left\{ \lambda c_{0:n}^T : g(\theta) \le \sum_{k=0}^{n} \lambda_k \cos(k\theta),\ \theta \in (-\pi, \pi] \right\}, \tag{7}$$

where $\lambda = (\lambda_0, \lambda_1, \ldots, \lambda_n)$ are Lagrange multipliers. 

Remark 6: Along these lines Lang and Marzetta in [36], 
[37] sought to quantify the maximal and minimal spectral mass 
in a specified interval given the covariances $c_{0:n}$. To this end 
we may take $g = \chi_I$, the characteristic function of an interval 
$I$, that is, $\chi_I(\theta) = 1$ if $\theta \in I$ and $0$ otherwise. Lower and upper 
bounds on $\int_I d\mu$ are finite and are then given by (6) and (7), 
respectively. However, since $g = \chi_I$ is not continuous, the 
mass in an interval is not a weakly continuous quantity, and 
the conclusion of Corollary 3 does not hold. In fact, for this 
case the gap between the upper and lower bound does not 
necessarily converge to zero as $n$ goes to infinity. This occurs, 
e.g., in the case when the true spectrum has a spectral line at 
an end point of the interval. □ 

A class of weakly continuous metrics can be sought in the 
form 

$$\delta(d\mu_0, d\mu_1) = \sup_{\xi \in K} \left| \int g_\xi\, (d\mu_0 - d\mu_1) \right| \tag{8}$$

for $\{g_\xi\}_{\xi \in K} \subset C(\mathbb{T})$, provided the family $\{g_\xi\}_{\xi \in K}$ of test 
functions is sufficiently rich to distinguish between measures 
and yet small enough so that continuity is ensured. The precise 
conditions are given next. 

Proposition 7: The functional $\delta(d\mu_0, d\mu_1)$ defined in (8) is 
a weakly continuous metric if and only if the following two 
conditions hold: 

(a) for any two distinct measures $d\mu_0, d\mu_1 \in \mathfrak{M}$, there is a $\xi \in K$ 
such that $\int_{\mathbb{T}} g_\xi\, d\mu_0 \ne \int_{\mathbb{T}} g_\xi\, d\mu_1$, and 

(b) the set $\{g_\xi\}_{\xi \in K}$ in $C(\mathbb{T})$ is equicontinuous$^2$ and uniformly 
bounded. 

Proof: See the appendix. ■ 
In essence, condition (a) ensures positivity while condition (b) 
ensures weak continuity. The triangle inequality and symmetry 
always hold for such $\delta$. The total variation norm is an example 
of why (b) is needed: it is a norm of the form (8) where the 
set of test functions is the $C(\mathbb{T})$ unit ball, $\{g : \|g\|_\infty \le 1\}$, 
but it is not weakly continuous. This is due to the fact that the 
unit ball in $C(\mathbb{T})$ is not equicontinuous. 

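2 A family of functions $\{g_\xi\}_{\xi \in K} \subset C(\mathbb{T})$ is said to be equicontinuous if 
for any $\epsilon$ there exists a $\gamma$ such that $|g_\xi(\theta_1) - g_\xi(\theta_2)| < \epsilon$ if $|\theta_1 - \theta_2| < \gamma$ 
for all $\theta_1, \theta_2 \in \mathbb{T}$, and $\xi \in K$. 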



Remark 8: A more general family of distances is of the 
form 

$$\delta(d\mu_0, d\mu_1) = \sup_{\substack{g_0(\theta) \in K_0,\ g_1(\phi) \in K_1, \\ g_0(\theta) + g_1(\phi) \in K}} \int g_0\, d\mu_0 + \int g_1\, d\mu_1,$$

where $K_0, K_1 \subset C(\mathbb{T})$ and $K \subset C(\mathbb{T} \times \mathbb{T})$. By selecting the 
sets $K_0$, $K_1$, and $K$ properly, $\delta$ (or a monotone function of $\delta$) 
will be a weakly continuous metric. One such example is the 
metrics based on optimal transportation treated in [19], where 
the metrics have non-local properties such as geodesics which 
preserve lumpedness. □ 
Next we consider three ways for devising weakly continuous 
metrics. The first uses smoothing of the power spectra to be 
compared by suitable test functions in a way that is analogous 
to the use of classical window kernels in periodogram 
estimation [?]. The second is based on Monge-Kantorovich 
optimal mass transportation where a cost is associated with 
mismatch in the frequency range where power resides. In 
this geometry, optimal-transport geodesics may be used to 
model slow time-varying drift in the spectral power of non-stationary 
time-series [19]; such models for non-stationarity 
lessen artifacts present when using ordinary interpolation (e.g., 
fade-in fade-out [30]). The third is based on Poisson kernels 
and is more suitable for differentiating spectra based on their 
content on specified frequency bands. The connection between 
Poisson kernels and the analytic and harmonic functions in (2) 
and (3) allows for evaluating bounds and the diameter of the 
uncertainty set with respect to the corresponding distances. 
This will be explored in the case where finitely many error-free 
covariances are known in Sections V to VIII. 

A. Metrics based on smoothing 

A simple way to devise weakly continuous metrics which 
has a classical flavor is to first smooth the measures via 
convolution with a fixed suitable continuous function, and 
then to compare the smoothed spectral densities. This echoes 
the use of windowing Fourier techniques in the time domain 
[48] where a suitable choice of a window is used to trade off 
resolution and variance of the estimator. Likewise here, the 
choice of a windowing function determines the resolution of 
the metric. 

Thus, let $g \in C(\mathbb{T})$ be such a windowing function, and 
define 

$$\delta_{\mathrm{smooth},g}(d\mu_0, d\mu_1) := \|g * (d\mu_0 - d\mu_1)\|_\infty.$$

Here, 

$$(g * d\mu)(\xi) := \int g(\xi - \theta)\, d\mu(\theta)$$

denotes the circular convolution and $\|\cdot\|_\infty$ the supremum norm. In 
view of Proposition 7, $\delta_{\mathrm{smooth},g}$ is of the form 

$$\|g * (d\mu_0 - d\mu_1)\|_\infty = \sup_{\xi \in (-\pi, \pi]} \left| \int g(\xi - \theta)\,(d\mu_0(\theta) - d\mu_1(\theta)) \right|.$$

The test functions are the translates $g(\xi - \cdot)$ of a single 
continuous function, which form an equicontinuous and uniformly 
bounded family, and hence condition (b) of the proposition holds. In addition, 
the chosen convolution-kernel functions must not have any 
zero Fourier coefficient, otherwise the approach will fail to 
differentiate between certain measures. To see this, let 
$g(\theta) = \sum_{k=-\infty}^{\infty} g_k e^{-ik\theta}$ and let $(\ldots, a_{-1}, a_0, a_1, \ldots)$ be the Fourier 
coefficients of $d\mu_0(\theta) - d\mu_1(\theta)$; then, up to a constant normalization factor, 

$$g * (d\mu_0 - d\mu_1)(\xi) = \sum_{k=-\infty}^{\infty} g_{-k}\, a_k\, e^{ik\xi}.$$

If $g_k \ne 0$ for all $k \in \mathbb{Z}$, the above expression cannot vanish 
identically unless all the $a_k$'s are zero, in which case $d\mu_0 = d\mu_1$. 
In this case (a) holds and it follows from Proposition 7 
that $\delta_{\mathrm{smooth},g}(d\mu_0, d\mu_1)$ is a weakly continuous metric. This 
leads to the next proposition. 

Proposition 9: Let $g \in C(\mathbb{T})$ be a windowing function with 
non-vanishing Fourier coefficients. Then $\delta_{\mathrm{smooth},g}(d\mu_0, d\mu_1)$ 
is a weakly continuous metric. 
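
The role of the non-vanishing Fourier coefficients can be illustrated with a small sketch. Everything in it is an assumption made for the example: the measures are discrete (lists of `(theta_j, mass_j)` pairs), the supremum is approximated on a uniform grid, and the two windows are chosen by hand. The window $1 + \cos\theta$ has vanishing Fourier coefficients for $|k| \ge 2$, whereas $e^{\cos\theta}$ has strictly positive Fourier coefficients (modified Bessel functions).

```python
import math

def smooth_distance(g, lines0, lines1, grid=1024):
    """Grid approximation of sup_xi |(g * (dmu0 - dmu1))(xi)| for discrete measures."""
    best = 0.0
    for j in range(grid):
        xi = -math.pi + 2 * math.pi * j / grid
        val = (sum(m * g(xi - th) for th, m in lines0)
               - sum(m * g(xi - th) for th, m in lines1))
        best = max(best, abs(val))
    return best

# Two different measures whose difference only has Fourier coefficients at k = 2 (mod 4).
mu0 = [(0.0, 0.5), (math.pi, 0.5)]
mu1 = [(math.pi / 2, 0.5), (-math.pi / 2, 0.5)]

def cosine_window(t):
    return 1.0 + math.cos(t)       # Fourier coefficients vanish for |k| >= 2

def von_mises_window(t):
    return math.exp(math.cos(t))   # all Fourier coefficients are positive

print(smooth_distance(cosine_window, mu0, mu1))     # numerically zero: fails to separate
print(smooth_distance(von_mises_window, mu0, mu1))  # strictly positive
```

With the first window the two distinct measures are indistinguishable, which is exactly the failure of condition (a); the second window separates them.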



B. Metrics based on optimal transportation 

A rapidly growing literature [50] on a classical problem, 
known as the Monge-Kantorovich transportation problem, has 
impacted a wide range of disciplines, from probability theory 
to fluid dynamics and economics [42]. Optimal transportation 
refers to the correspondence between distributions of masses 
that induces the least amount of transportation cost.$^3$ The optimal 
transportation cost between two probability distributions 
induces weakly continuous metrics, known as Wasserstein 
metrics, which are extensively used in probability theory. In 
order to handle more general distributions we need a suitable 
modification to compare unequal masses. This we do next and 
connect with the formalism in (8). 

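The Monge-Kantorovich transportation problem amounts to 
minimizing the cost of transportation between two distributions 
of equal mass, e.g., $d\mu_0$ and $d\mu_1$ where $\int_{\mathbb{T}} d\mu_0 = \int_{\mathbb{T}} d\mu_1$. 
In this, a transportation plan $d\pi(\theta, \phi)$ is sought which corresponds 
to a non-negative distribution on $\mathbb{T} \times \mathbb{T}$ and is such 
that 

$$\int_{\theta \in \mathbb{T}} d\pi(\theta, \phi) = d\mu_0(\phi) \quad \text{and} \quad \int_{\phi \in \mathbb{T}} d\pi(\theta, \phi) = d\mu_1(\theta). \tag{9}$$

Then, the minimal cost 

$$\min\left\{ \int_{\mathbb{T} \times \mathbb{T}} |\theta - \phi|\, d\pi(\theta, \phi) : d\pi \text{ satisfies (9)} \right\}$$

is the Wasserstein-1 distance between $d\mu_0$ and $d\mu_1$, and is 
a weakly continuous metric (see, e.g., [50, chapter 7]). This 
problem admits a dual formulation, known as the Kantorovich 
duality: 

$$W_1(d\mu_0, d\mu_1) = \max_{\|g\|_L \le 1} \int g\,(d\mu_0 - d\mu_1),$$

where $\|f\|_L = \sup_{\theta \ne \phi} \frac{|f(\theta) - f(\phi)|}{|\theta - \phi|}$ denotes the Lipschitz norm. 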
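
For discrete measures of equal mass, the Wasserstein-1 distance with cost $|\theta - \phi|$ can be evaluated without solving the transportation problem. The sketch below makes a simplifying assumption: it treats $(-\pi, \pi]$ as an interval of the real line (not as a circle with geodesic distance), in which case the optimal cost equals the integral of the absolute difference of the two cumulative distributions. The function name `wasserstein1` is hypothetical.

```python
def wasserstein1(lines0, lines1, tol=1e-12):
    """W_1 between discrete measures of equal mass on an interval, cost |theta - phi|.

    Each measure is a list of (theta_j, mass_j) pairs. The value is the integral of
    |F0 - F1|, where F0 and F1 are the cumulative distribution functions.
    """
    mass0 = sum(m for _, m in lines0)
    mass1 = sum(m for _, m in lines1)
    if abs(mass0 - mass1) > tol:
        raise ValueError("W_1 requires measures of equal total mass")
    # Merge the support points, with signed masses, and sweep from left to right.
    events = sorted([(th, m) for th, m in lines0] + [(th, -m) for th, m in lines1])
    cost, cdf_gap, prev = 0.0, 0.0, events[0][0]
    for th, signed_mass in events:
        cost += abs(cdf_gap) * (th - prev)
        cdf_gap += signed_mass
        prev = th
    return cost

print(wasserstein1([(0.0, 1.0)], [(0.25, 1.0)]))               # one line moved by 0.25
print(wasserstein1([(-1.0, 0.5), (1.0, 0.5)], [(0.0, 1.0)]))   # two half-masses moved by 1
```

Shifting a spectral line by a small amount changes this distance by a correspondingly small amount, which is the weak continuity property that the total variation lacks.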

Power spectra, in general, cannot be expected to have the 
same total mass. In this case, $\delta_{1,\kappa}(d\mu_0, d\mu_1)$ defined by 

$$\inf_{\int d\nu_0 = \int d\nu_1} W_1(d\nu_0, d\nu_1) + \kappa \sum_{i=0}^{1} \int |d\mu_i - d\nu_i|, \tag{10}$$

is a weakly continuous metric for an arbitrary but fixed $\kappa > 0$. 
The interpretation is that $d\mu_0$ and $d\mu_1$ are perturbations of 
the two underlying measures $d\nu_0$ and $d\nu_1$, respectively, which 

3 L. Kantorovich received the 1975 Nobel Prize for the impact of this theory 
on allocation of economic resources. 



have equal mass. Then, the cost of transporting $d\mu_0$ and $d\mu_1$ 
to one another can be thought of as the cost of transporting 
$d\nu_0$ and $d\nu_1$ to one another, plus the size of their respective 
perturbations from $d\mu_0$ and $d\mu_1$. This is introduced in [19] 
and this metric admits a dual formulation 

$$\delta_{1,\kappa}(d\mu_0, d\mu_1) = \max_{\substack{\|g\|_\infty \le \kappa \\ \|g\|_L \le 1}} \int g\,(d\mu_0 - d\mu_1),$$

which is in the form of Proposition 7. Various other 
generalizations of the transportation distance that apply to 
power spectra are also being proposed and studied in [19]. 

C. Metrics based on the Poisson kernel 

Power spectra are weak limits of the real part of analytic 
functions on the unit disc, as indicated earlier. Comparison 
of these functions induces weakly continuous metrics which 
readily fall under the framework of (8). Interestingly, this 
approach allows for both the computation of explicit/analytic 
bounds on uncertainty sets (see Section V) and for specifying 
a frequency dependent resolution of a metric (see Remark 11 
and the example in Section VI). 

Recall from Section II that the harmonic function associated 
with a measure is the Poisson integral, defined as 

$$P[d\mu](z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(t - \theta)\, d\mu(\theta), \quad z = re^{it}.$$

Weak convergence of measures is equivalent to certain types 
of convergence of their harmonic counterparts. 

Proposition 10: Let $\{d\mu_k\}_{k=1}^{\infty}$ be a sequence of uniformly 
bounded signed measures on $\mathbb{T}$, let $d\mu$ be a bounded measure 
on $\mathbb{T}$, and let $u(z) = P[d\mu](z)$, $u_k(z) = P[d\mu_k](z)$ be their 
corresponding Poisson integrals. The following statements are 
equivalent: 

(a) $d\mu_k \to d\mu$ weakly, 

(b) $u_k(z) \to u(z)$ pointwise $\forall z \in \mathbb{D}$, 

(c) $u_k(z) \to u(z)$ in $L_1(\mathbb{D})$, 

(d) $u_k(z) \to u(z)$ uniformly on every compact subset of $\mathbb{D}$. 

Proof: The proof is given in the appendix. ■ 

Each of the statements (b), (c), and (d) may be used for 
devising weakly continuous metrics. We shall focus on 
statement (d), indicating that weakly continuous metrics can 
be constructed by comparing the harmonic functions on a subset 
of $\mathbb{D}$. In fact, the maximal distance between the harmonic 
functions on a closed non-finite set $K \subset \mathbb{D}$ gives rise to a 
weakly continuous metric 

$$\delta_K(d\mu_0, d\mu_1) := \max_{z \in K} |P[d\mu_0 - d\mu_1](z)|. \tag{11}$$

This is true, since the resulting family of Poisson kernels 
satisfies the properties in Proposition 7. To see this, first note 
that any two harmonic functions which coincide on $K$, a 
closed non-finite set inside $\mathbb{D}$, must be identical, hence (a) is 
satisfied. Furthermore, since $K \subset \gamma\mathbb{D}$ for some $\gamma < 1$, the 
magnitude and derivative of $P_r(t - \theta)$ are uniformly bounded 
when $re^{it} \in K$, hence (b) holds. 
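
A rough numerical version of the metric in (11) can be sketched as follows. The assumptions are again illustrative: the spectra are discrete (lists of `(theta_j, mass_j)` pairs), the set $K$ is an arc of radius $r$ sampled at finitely many points, and the names `poisson_integral`, `delta_K` and `arc` are hypothetical. Sampling $K$ means the code returns a lower approximation of the maximum.

```python
import cmath
import math

def poisson_integral(lines, z):
    """P[dmu](z) for a discrete measure, with Poisson kernel P_r(t - theta), z = r e^{it}."""
    r, t = abs(z), cmath.phase(z)
    return sum(m * (1 - r * r) / (1 - 2 * r * math.cos(t - th) + r * r)
               for th, m in lines) / (2 * math.pi)

def delta_K(lines0, lines1, K_points):
    """Maximum of |P[dmu0 - dmu1](z)| over a finite sample of the set K."""
    return max(abs(poisson_integral(lines0, z) - poisson_integral(lines1, z))
               for z in K_points)

def arc(r, theta0, eps, num=201):
    """Sample points of the arc {r e^{i theta} : theta in [theta0 - eps, theta0 + eps]}."""
    return [r * cmath.exp(1j * (theta0 - eps + 2 * eps * j / (num - 1)))
            for j in range(num)]

K = arc(0.9, 0.0, 0.3)   # K emphasizes frequencies near theta = 0
near = delta_K([(0.0, 1.0)], [(0.2, 1.0)], K)                    # mismatch inside the band
far = delta_K([(math.pi, 1.0)], [(math.pi - 0.2, 1.0)], K)       # same mismatch, far from the band
print(near, far)
```

The same shift of a spectral line is weighted much more heavily when it occurs near the frequencies that $K$ singles out, which is the frequency-dependent resolution referred to above.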

Remark 11: In practice, it is often the case that one is 
interested in comparing spectra over selected frequency bands. 
To this end, various schemes have been considered which 
rely on pre-processing with a choice of "weighting" filters 
and filter banks (see e.g., [?], [49], and [6], [16]). The 
choice of the point-set $K$ in (11) can be used to dictate 
the resolution of the metric over such frequency bands. To 
see how this can be done, consider $K$ to designate an arc 
$\{\xi = re^{i\theta} : \theta \in [\theta_0 - \epsilon, \theta_0 + \epsilon]\}$. This satisfies the conditions 
of Proposition 7 and thus, $\delta_K$ is a weakly continuous metric. 
At the same time, the values $P[d\mu](\xi)$, with $\xi \in K$, represent 
the variance at the output of a filter with transfer function 
$z/(z - \xi)$. These are bandpass filters with a center frequency 
$\arg(\xi)$ and bandwidth which depends on the choice of $r$. Thus, 
in essence, the metric compares the respective variances after 
the spectra have been weighted by a continuum (for $\xi \in K$) 
of such frequency-selective filters. □ 

V. The size of the uncertainty set 

The diameter of the uncertainty set with respect to the 
distance $\delta_K$ turns out to be especially easy to compute: it is 
realized as the distance between two "diametrically opposite" 
measures with only $n + 1$ spectral lines each (i.e., measures 
having compact support). This is the content of the following 
proposition. 

Proposition 12: Let $c_{0:n}$ be a positive covariance sequence 
and let $K \subset \mathbb{D}$ be closed. Then 



max < 

zeK 



(b„d 



z/T- 1 



(b z ,b z ) 



T -i 



(d z ,d z ) T -i 
(b z ,b z ) T -i 



where 



z 



I z 

z~ 



d, = 



2 (c<H 



2dz) 



\ 



l (co + 2c lZ + --- + 2c n z n )J 



and $\langle x, y \rangle_{T^{-1}}$ denotes the inner product 

$$\langle x, y \rangle_{T^{-1}} := y^* T^{-1} x.$$

Furthermore, $\rho_{\delta_K}(\mathcal{F}_{c_{0:n}})$ is attained as the distance between 
two elements of $\mathcal{F}_{c_{0:n}}$ which are both singular with support 
containing at most $n + 1$ points. 

Proof: The proof is given in the appendix. ■ 
Both claims in Proposition 12 can be used separately 
for computing $\rho_{\delta_K}(\mathcal{F}_{c_{0:n}})$. The first one suggests finding 
a maximum of a real-valued function over $K$. The second 
claim suggests a search for a maximum of $\delta_K(d\mu_1, d\mu_2)$ 
over a rather small subset of $\mathrm{ext}(\mathcal{F}_{c_{0:n}})$, namely nonnegative 
sequences $c_{0:(n+1)}$ parametrized by $c_{n+1}$, i.e., solutions of the 
quadratic equation 

$$\det(T_{n+1}) = 0. \tag{12}$$

The (complex) values for $c_{n+1}$ satisfying (12) lie on a circle 
in the complex plane, and hence, computation of $\rho_{\delta_K}(\mathcal{F}_{c_{0:n}})$ 
requires search on a torus (each of the two extremal $d\mu_1$, $d\mu_2$ 
where the diameter is attained can be thought of as points on 
the circle). 
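
The circle of singular extensions can be made concrete in the smallest nontrivial case. The sketch below assumes $n = 1$, so that the unknown is $c_2$ and $T_2$ is a $3 \times 3$ Hermitian Toeplitz matrix; for this case a direct computation gives $\det T_2 = 0$ exactly when $|c_2 - c_1^2/c_0| = c_0 - |c_1|^2/c_0$. The function names are hypothetical.

```python
import cmath
import math

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def toeplitz3(c0, c1, c2):
    """Hermitian Toeplitz matrix T_2 with first column (c0, c1, c2) and c_{-k} = conj(c_k)."""
    c1, c2 = complex(c1), complex(c2)
    return [[c0, c1.conjugate(), c2.conjugate()],
            [c1, c0, c1.conjugate()],
            [c2, c1, c0]]

def singular_extensions(c0, c1, num=8):
    """Sample values of c_2 with det(T_2) = 0: a circle in the complex plane."""
    c1 = complex(c1)
    centre = c1 * c1 / c0
    radius = c0 - abs(c1) ** 2 / c0
    return [centre + radius * cmath.exp(2j * math.pi * k / num) for k in range(num)]

c0, c1 = 1.0, 0.3 + 0.4j
for c2 in singular_extensions(c0, c1):
    print(c2, abs(det3(toeplitz3(c0, c1, c2))))   # determinants are numerically zero
```

Each point on this circle determines one singular element of the uncertainty set, so a pair of candidate extremal measures corresponds to a pair of angles, that is, a point on a torus.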

We elucidate this with an example. Figure 1 shows
$\rho_{\delta_K}(\mathcal{F}_{c_{0:2}})$ for $c_{0:2} = (1, c_1, c_2)$
as a function of the corresponding partial autocorrelation coefficients,
also known as Schur parameters (see the appendix),

\[ -1 < \gamma_1 := c_1 < 1, \qquad
   -1 < \gamma_2 := \frac{\det\begin{pmatrix} c_1 & c_2 \\ 1 & c_1 \end{pmatrix}}
                         {\det\begin{pmatrix} 1 & c_1 \\ c_1 & 1 \end{pmatrix}} < 1, \]

and $K$ is taken as $\{z : |z| \le 0.5\} \subset \mathbb{D}$.

Fig. 1. The uncertainty diameter $\rho_{\delta_K}$ as a function of $\gamma_1$, $\gamma_2$ (axes: first and second Schur parameter) when $c_0 = 1$ and $K = \{z : |z| \le 0.5\}$.
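For readers who want to move between covariances and Schur parameters numerically, the sketch below runs the Levinson-Durbin recursion for a real covariance sequence. It is an illustration under stated assumptions rather than the construction used in the paper: the name `schur_parameters` is hypothetical, only real sequences are handled, and sign conventions for partial autocorrelation coefficients differ between references (here the first coefficient equals $c_1/c_0$).

```python
import numpy as np

def schur_parameters(c):
    """Partial autocorrelation coefficients of a real covariance sequence.

    Levinson-Durbin recursion; assumes c = (c_0, ..., c_n) is real with a
    positive definite Toeplitz matrix of order n (so no division by zero).
    """
    c = np.asarray(c, dtype=float)
    a = np.array([1.0])          # coefficients of the current predictor
    err = c[0]                   # current prediction error variance
    gammas = []
    for k in range(1, len(c)):
        acc = sum(a[j] * c[k - j] for j in range(k))
        g = acc / err
        gammas.append(g)
        a_ext = np.append(a, 0.0)
        a = a_ext - g * a_ext[::-1]   # order update of the predictor
        err *= 1.0 - g**2
    return gammas
```

The sequence is positive exactly when all returned values lie strictly between -1 and 1; a value of modulus one signals that the boundary of the "positive" region has been reached.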

The plot confirms that the diameter decreases to zero as the
parameters or, alternatively, the covariances $c_1$ and $c_2$, tend
to the boundary of the "positive" region (which in the Schur
coordinates corresponds to the unit square). However, it is
interesting to note that the diameter of $\mathcal{F}_{c_{0:n}}$ as a function of
$c_{0:n}$ has several local maxima. The maximal diameter may
be explicitly calculated, and hence provides an a priori bound on
the uncertainty.

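Theorem 13: Let $r = \max\{|z| : z \in K\}$. Then

\[ \rho_{\delta_K}(\mathcal{F}_{c_{0:n}}) \le \frac{4 c_0 r^{n+1}}{1 - r^2}. \tag{13} \]

Further, (13) holds with equality if and only if
$c_{0:n} = (c_0, c_0 a, c_0 a^2, \ldots, c_0 a^n)$ for some $a \in K$ with $|a| = r$.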

Proof: The proof is given in the appendix. ■ 
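The bound (13) is a closed-form expression and is easy to evaluate. The short sketch below (the function name `covariance_diameter_bound` is a hypothetical helper, not part of the paper's software) simply computes the right-hand side of (13).

```python
def covariance_diameter_bound(c0, r, n):
    """Right-hand side of (13): 4 * c0 * r**(n+1) / (1 - r**2).

    c0 is the total spectral mass, r the largest modulus of a point in K
    (0 <= r < 1), and n the highest covariance lag that is available.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("r must satisfy 0 <= r < 1")
    return 4.0 * c0 * r ** (n + 1) / (1.0 - r ** 2)
```

For instance, `covariance_diameter_bound(1.0, 0.9, 20)` evaluates to approximately 2.304, and with $r = 0.5$, $n = 2$, $c_0 = 1$ (the setting of Figure 1) the bound is 2/3. The bound decays geometrically in n, which quantifies how additional covariance lags shrink the worst-case uncertainty.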
Remark 14: Computation of the diameter $\rho_\delta(\mathcal{F}_{c_{0:n}})$ of the
uncertainty set amounts to solving the infinite-dimensional
optimization problem

\[ \sup\{\delta(d\mu_1, d\mu_2) : d\mu_1, d\mu_2 \in \mathcal{F}_{c_{0:n}}\}. \tag{14} \]

If $\delta$ is a weakly continuous and jointly convex function, then
the diameter is attained as the distance between two
elements which are extreme points of $\mathcal{F}_{c_{0:n}}$. Extreme points are
the points with the property that they themselves are not a
convex combination of other elements in the set; the set of
extreme points is denoted by $\mathrm{ext}(\cdot)$. Then, $d\mu \in \mathrm{ext}(\mathcal{F}_{c_{0:n}})$
if and only if $d\mu \in \mathcal{F}_{c_{0:n}}$ and the support of $d\mu$ consists of
at most 2n + 1 points (see PD). Thus, $\mathrm{ext}(\mathcal{F}_{c_{0:n}})$ admits a
finite-dimensional characterization and (14) reduces to a
finite-dimensional problem. □
Fig. 2. The "true" power spectrum $d\nu$.

Fig. 3. Subplot 1: The power spectrum $d\nu$ (solid), $d\mu_5$ (dashed). Subplot 2:
$P[d\nu](0.9e^{i\theta})$ (solid), $P[d\mu_5](0.9e^{i\theta})$ (dashed), along with bounds based
on $c_{0:5}$.



VI. Identification in a weak sense 

In this section we elucidate how the uncertainty set is 
affected by the number of moments and show that spectra may 
be close in the weak sense even though they are qualitatively 
very different. 

Consider the stochastic process 

y t = cos(0.5i + <pi) + cos(t + ip 2 ) + w t + 7^ w t-l 

where $w_t$ is a white noise process and $\varphi_1$, $\varphi_2$ are random
variables with uniform distribution on $(-\pi, \pi]$. The power
spectrum $d\nu$ is depicted in Figure 2; it has
both an absolutely continuous part and a singular part.
We would like to identify this spectrum relying on covariance
data and derive bounds on the estimation error. We will use
the metric $\delta_K$ where $K = \{z : |z| = 0.9\}$, i.e.,

\[ \delta_K(d\mu_0, d\mu_1) = \sup_{|z| = 0.9} \left| P[d\mu_0 - d\mu_1](z) \right|. \]

Let $c$ be the covariance sequence of $d\nu$ and let $d\mu_5$ and
$d\mu_{20}$ be the power spectra with highest entropy in the sets
$\mathcal{F}_{c_{0:5}}$ and $\mathcal{F}_{c_{0:20}}$, respectively. Figure 3 compares $d\mu_5$ and $d\nu$,
where the estimation error and the uncertainty diameter are

\[ \delta_K(d\nu, d\mu_5) = 5.66, \qquad \rho_{\delta_K}(\mathcal{F}_{c_{0:5}}) = 20.79. \]

The first subplot shows and compares these two power spectra.
The second subplot displays $P[d\nu](0.9e^{i\theta})$, $P[d\mu_5](0.9e^{i\theta})$,
along with bounds on $P[d\mu](0.9e^{i\theta})$ when $\mu \in \mathcal{F}_{c_{0:5}}$. It is seen
that the spectrum $d\mu_5$ does not distinguish the two peaks. The
information in $c_{0:5}$ is clearly not sufficient to distinguish the two
spectral lines, as the $\delta_K$-bounds are substantial.

Figure 4 now compares $d\mu_{20}$ and $d\nu$ in a similar manner.
The estimation error and the uncertainty diameter are

\[ \delta_K(d\nu, d\mu_{20}) = 0.29, \qquad \rho_{\delta_K}(\mathcal{F}_{c_{0:20}}) = 2.52. \]

Here, $d\mu_{20}$ has two peaks close to the spectral lines and
$P[d\mu_{20}](0.9e^{i\theta})$ resembles $P[d\nu](0.9e^{i\theta})$ quite closely. In
fact, the bounds/envelopes already reflect the presence of the
two peaks.

To amplify the point made above, consider $d\mu_{\rm line}$ to be the
(unique) power spectrum in $\mathcal{F}_{c_{0:20}}$ having Schur parameter
$\gamma_{21} = 1$; this corresponds to a deterministic process (having
only spectral lines) and is depicted in Figure 5. Subplot 2
shows $P[d\mu_{\rm line}](0.9e^{i\theta})$ and how it "sits" within the respective
bounds. In the absence of additional information, $d\mu_{\rm line}$,
$d\mu_{20}$, or any other power spectrum in $\mathcal{F}_{c_{0:20}}$ is admissible.
The "worst case" distance between any two is the diameter
computed above.

Remark 15: Even though the three spectra $d\nu$, $d\mu_{20}$, and
$d\mu_{\rm line}$ have identical covariances $c_{0:20}$, they are quite different
in terms of their respective singular and continuous parts.
However, they are similar in their distribution of spectral mass:
they have most of their mass located around the frequency
points $\theta = 0.5$ and $\theta = 1$, and this is what the weak topology
captures.

Remark 16: Standard pointwise distances between $d\nu$,
$d\mu_{20}$, and $d\mu_{\rm line}$ do not provide a meaningful comparison. For
instance, the Itakura-Saito distance [23], the Kullback-Leibler
divergence [20], and the cepstral distance [24] all
contain a logarithmic term and therefore give the value $\infty$ when
comparing $d\mu_{20}$ and $d\mu_{\rm line}$. On the other hand, the $L_2$ metric
does not apply to the present context because spectral lines
cannot be viewed as "$L_2$ functions", and if they are approximated the
norm diverges to infinity. Finally, the total variation does not
differentiate between spectral lines that are nearby and those that are far apart (cf.
Remark 5).
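The contrast drawn in Remarks 15 and 16 can be checked numerically for pure line spectra. The sketch below is illustrative only: the helper names are hypothetical, the Poisson integral is normalized by $1/(2\pi)$ (an assumption about the convention), and the supremum over the circle $|z| = r$ is approximated on a finite grid. Two unit-mass lines have the same total-variation distance whenever their locations differ, whereas the Poisson-based distance shrinks as the lines approach each other.

```python
import numpy as np

def poisson_kernel(r, theta):
    """Poisson kernel P_r(theta) = (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1 - r**2) / (1 - 2 * r * np.cos(theta) + r**2)

def poisson_of_lines(masses, locations, r, theta):
    """Poisson integral (1/2pi) sum_j m_j P_r(theta - theta_j) of a line spectrum."""
    theta = np.asarray(theta, dtype=float)[:, None]
    m = np.asarray(masses, dtype=float)[None, :]
    loc = np.asarray(locations, dtype=float)[None, :]
    return (m * poisson_kernel(r, theta - loc)).sum(axis=1) / (2 * np.pi)

def delta_circle(m0, loc0, m1, loc1, r=0.9, grid=4096):
    """Grid approximation of sup over |z| = r of |P[dmu0 - dmu1](z)|."""
    theta = np.linspace(-np.pi, np.pi, grid, endpoint=False)
    diff = (poisson_of_lines(m0, loc0, r, theta)
            - poisson_of_lines(m1, loc1, r, theta))
    return np.abs(diff).max()
```

Comparing a line at 0.5 rad with lines at 0.51 rad and at 1 rad, `delta_circle([1], [0.5], [1], [0.51])` is much smaller than `delta_circle([1], [0.5], [1], [1.0])`, while the total variation of the difference equals 2 in both cases.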

VII. Generalized statistics 

Our analysis extends readily to the case of generalized 
statistics Q, 03], 0, fl4l . The formalism in these references, 
nicknamed THREE (for "tunable high resolution estimation") 
allows for the possibility of tunable filter-banks and was 
shown to provide improved resolution, albeit, quantitative 
assessments of the benefits exist only in special cases Q. 
We briefly sketch the formalism here, for lack of space, and 
we refer to the aforementioned references for more detailed 
accounts. 

Fig. 4. Subplot 1: Power spectrum $d\nu$ (solid), $d\mu_{20}$ (dashed). Subplot 2:
$P[d\nu](0.9e^{i\theta})$ for the true spectrum (solid), $P[d\mu_{20}](0.9e^{i\theta})$ for $d\mu_{20}$
(dashed), along with bounds based on $c_{0:20}$.

Fig. 5. Subplot 1: Power spectrum $d\mu_{\rm line}$ (line spectrum). Subplot 2:
$P[d\mu_{\rm line}](0.9e^{i\theta})$ for the line spectrum, along with bounds based on $c_{0:20}$.

We explain the formalism of generalized statistics in the
setting of "filter-banks", i.e., we consider the stochastic process
$y_t$ as driving a bank of first-order dynamical systems with
transfer functions



-, for k = 0, 1, . . . , n, with \z k \ < 1 



G k {z) := - 

Z-Zk 

as shown in Figure 6. The joint covariance matrix of the filter-bank
outputs is

\[ P = E\{u(t)u(t)^*\}, \]

where $u(t) := (u_0(t), u_1(t), \ldots, u_n(t))^T$. As indicated earlier,
$t \in \mathbb{Z}$ is the time index. The covariance matrix takes the form
of a Pick matrix

\[ P := \left[ \frac{w_k + \bar{w}_\ell}{1 - z_k \bar{z}_\ell} \right]_{k,\ell=0}^{n}, \tag{15} \]

where

\[ w_k = \tfrac{1}{2}(1 - z_k^2)\, E\{u_k^2\} \]

(see Q Equations (2.8), (2.10)] and H21 page 783, Equation 
(7)]). The matrix $P$ replaces the ordinary Toeplitz covariance
in the previous sections. Certain observations are in order:
given the filter-bank dynamics, i.e., the $z_k$'s, i) $P$ depends only
on the values $w_k$, and ii) the cross-covariances between filter-bank
elements can be computed from the output covariances
of all elements individually, that is, from the $w_k$'s.

A rather complete theory has been developed to characterize 
power spectra for the input process that are consistent with 
output-covariance (more generally, state-covariance) statistics. 
This theory provides among other things a construction of the 
unique input spectrum of maximal entropy, spectral envelopes
that are reminiscent of the Capon pseudo-spectra, and the 
identification of spectral lines with techniques analogous to 
the theory of the Pisarenko Harmonic Decomposition, MUSIC, 
ESPRIT, etc., and has been worked out in detail for matrix- 
valued power spectra as well (see e.g., IfTSI , Ifl6l , 1(171 , IfTSl , 

ma, mi). 

We restrict our attention to the present setting where $\{y_t\}_{t \in \mathbb{Z}}$
is scalar as before and so are the filters. We assume estimates
for the output covariances, hence for the values $w_k$. As before,
we denote by $\mathcal{F}_{z,w}$ the family of power spectra for the
process $\{y_t\}_{t \in \mathbb{Z}}$ which are consistent with these values, and we
are interested in assessing the size of this family as a measure
of our spectral uncertainty.

The following proposition can be derived almost verbatim
as Proposition 12. See [32] for an independent proof.

Proposition 17: Let $z_0, \ldots, z_n$ and $w_0, \ldots, w_n$ be such that
the Pick matrix $P$ in (15) is positive and let $K \subset \mathbb{D}$ be closed.
Then



max 



where 



+ (b z ,d 



z,U, z )p-l 



z/P- 



(d z ,d z ) P -i 
(bz,b z ) P -i 



( \-ZQZ \ 



1 — Z\Z 



( 



1-ZQZ 



\ 

) 



Fig. 6. Bank of filters. 



V l-z n z I \ 1- 

and $(x,y)_{P^{-1}}$ denotes the inner product

\[ (x,y)_{P^{-1}} := y^* P^{-1} x. \]

As before, $\rho_{\delta_K}(\mathcal{F}_{z,w})$ is attained as the distance between
two elements of $\mathcal{F}_{z,w}$ which are both singular with support
containing at most n + 1 points.

As in the covariance case, a priori bounds on the uncertainty
may be calculated.

Theorem 18: Following the notation of Proposition 17, let
$z_0 = 0$ and

\[ B_z(z) = \prod_{k=0}^{n} \frac{z_k - z}{1 - \bar{z}_k z}. \]

Then

\[ \rho_{\delta_K}(\mathcal{F}_{z,w}) \le \max_{z \in K} \frac{4 w_0 |B_z(z)|}{1 - |z|^2}. \tag{16} \]

Further, (16) holds with equality if and only if

\[ w_k = w_0 \frac{1 + z_k a}{1 - z_k a} \quad \text{for } k = 1, \ldots, n \]

for some $a \in K$ maximizing $|B_z(a)|/(1 - |a|^2)$.

Proof: The proof is given in the appendix. ■

Here the a priori bound depends on the interpolation points $z_k$,
in addition to $K$, the model order n, and the total spectral mass
$w_0$. Therefore, by minimizing the right-hand side of (16) with
respect to the $z_k$, one can find the filter bank with the smallest
a priori uncertainty in the metric $\delta_K$. This will be exploited
in the following example to tune the filter-bank poles.
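To make the tuning criterion tangible, the right-hand side of (16) can be evaluated on a finite sample of K. The sketch below is a hedged illustration, not the authors' software: the names `blaschke_abs` and `three_diameter_bound` are hypothetical, the Blaschke product is taken over all poles supplied (including $z_0 = 0$), and sampling K only approximates the maximum from below.

```python
import numpy as np

def blaschke_abs(z, poles):
    """|B(z)| = prod_k |z_k - z| / |1 - conj(z_k) z| for each point in z."""
    z = np.asarray(z, dtype=complex)[:, None]
    p = np.asarray(poles, dtype=complex)[None, :]
    return np.prod(np.abs(p - z) / np.abs(1 - np.conj(p) * z), axis=1)

def three_diameter_bound(w0, poles, K_samples):
    """Approximate max over K of 4 * w0 * |B(z)| / (1 - |z|^2).

    K_samples is a finite set of points of K inside the open unit disc;
    poles must contain z_0 = 0 together with the remaining filter poles.
    """
    z = np.asarray(K_samples, dtype=complex)
    return float(np.max(4 * w0 * blaschke_abs(z, poles) / (1 - np.abs(z) ** 2)))
```

When all n + 1 poles are placed at the origin, $|B_z(z)| = |z|^{n+1}$ and the expression reduces to the covariance bound (13), which gives a convenient sanity check. A pole optimizer would wrap `three_diameter_bound` in a numerical minimizer over the poles $z_1, \ldots, z_n$; since the objective is nonconvex, only a local minimum can be expected.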

VIII. Uncertainty in the THREE framework with optimal filter selection

From this vantage point we now take up an example 
as before, with closely spaced sinusoids, and compare two 
alternative formalisms, one based on Toeplitz covariances and 
the other based on generalized statistics. 

Consider the stochastic process 

cos(0.5i + tpi) + cos(0.6i + W2) , s 1 
Vt = 1LJ -^ " —+cos(t+^ 3 )+w t + -w t - 

with two closely spaced spectral lines at 0.5 rad/s and 0.6 rad/s
superimposed with a spectral line at 1 rad/s and colored noise. We
choose as metric $\delta_K$, with $K \subset \mathbb{D}$ proximal to the region
where high resolution is desired, i.e., near 0.5 rad/s where
the two closely spaced sinusoids reside. More specifically, we
take⁴

\[ K := \{0.65\, e^{\pm 0.5 i} + 0.25\, \mathbb{T}\}. \]
This is depicted by the two circles in Figure 8.

Fig. 7. True spectrum $d\nu$ (solid) and estimated spectra $d\mu_{\rm THREE}$ (dashed-dotted)
and $d\mu_{\rm ME}$ (dashed).

Fig. 8. Set $K$ (solid red) and points $z_k$ (x in blue).

We compare the maximum entropy spectral estimate $d\mu_{\rm ME}$,
constructed using the covariances $c_0, c_1, \ldots, c_{20}$, with the
spectral estimate $d\mu_{\rm THREE}$, which is based on the output
statistics of the filter bank of $G_k(z)$'s. We select n = 10 and
filter-bank poles that minimize⁵ the a priori uncertainty bound
(16). The filter poles, indicated by "x" in Figure 8, are

\[ z_k \in \{0,\ 0.581 \pm 0.480i,\ 0.681 \pm 0.470i,\
             0.738 \pm 0.422i,\ 0.755 \pm 0.271i,\ 0.765 \pm 0.357i\}. \]

The THREE-spectrum is a "maximum entropy" distribution
which is now consistent with statistics other than the usual
autocorrelation ones ($d\mu_{\rm THREE}$ is the so-called "central solution"
of the Nevanlinna-Pick analytic interpolation theory⁶ applied to
distributions in $\mathcal{F}_{z,w}$).

4 $\mathbb{T}$ denotes, as before, the unit circle.

5 The pole $z_0 = 0$ and total spectral mass $w_0 = 1$ are assumed fixed.
The bound in (16) is then minimized over $z_1, \ldots, z_n$. Since the right-hand side of (16) is
nonconvex in $z$, only a local minimum is guaranteed.

6 Software is available at
http://www.ece.umn.edu/~georgiou/code/spec_analysis.t



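The a priori bounds on the uncertainty provided by Theorem 18
(for the THREE statistics) and Theorem 13 (for the covariances) are

\[ \rho_{\delta_K}(\mathcal{F}_{z,w}) \le 0.151\, w_0 = 0.468 \quad \text{and} \quad
   \rho_{\delta_K}(\mathcal{F}_{c_{0:20}}) \le 2.304\, c_0 = 7.167, \]

respectively. In our example $w_0 = c_0 = 28/9$. This shows that
the a priori bound on the uncertainty set with respect to $\delta_K$ is
considerably smaller when the THREE formalism is applied.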

The two spectral estimates together with the true power
spectrum are depicted in Figure 7. It can be seen that the two
closely spaced lines are not discernible in $d\mu_{\rm ME}$. On the other
hand, they are quite clearly distinguishable via THREE. This
is due to the choice of the dynamics $z$. As can be seen from
the figure, the resolution of $d\mu_{\rm THREE}$ is substantially higher
than that of $d\mu_{\rm ME}$ in the vicinity of 0.5 rad/s. We would also
like to compare the size of the uncertainty set for the two
scenarios. The respective diameters are

\[ \rho_{\delta_K}(\mathcal{F}_{z,w}) = 0.194 \quad \text{and} \quad \rho_{\delta_K}(\mathcal{F}_{c_{0:20}}) = 2.831. \]

Thus, when measured using $\delta_K$, the uncertainty set using the
THREE formalism is considerably smaller. Figure 9 displays
the Poisson integral of the true power spectrum evaluated on
$K$, and the corresponding bounds.⁷
Fig. 9. Bounds on estimates on $K$ based on covariances (top) and the THREE
formalism (bottom), respectively.



IX. Conclusions and future directions 

The choice of a metric is key to any quantitative scientific 
theory. Identification of power spectra is often based on 
second-order statistics (moments), and therefore, it is natural 
to metrize the space of power spectra in a way that respects 
continuity of moments. There is a variety of such weakly 
continuous metrics — metrics which localize "spectral mass". 
We presented various choices and focused on a particular
metric, $\delta_K$, which is amenable to quantifying the size of the
uncertainty set. We envision that this, and similar metrics, can
be used as tools for assessing uncertainty and robustness in
modeling and spectral analysis. We further expect that the
theory will be of use in filter design and in quantifying the
notion of resolution, as this is naturally connected to the size
of the spectral uncertainty set. Finally, we expect that these
metrics will conform with other subjective measures rooted in
perceptual qualities of signals (cf. [19, Example 10]).

Interest in weak continuity is not new. Indeed, a classical 
weakly continuous metric is the Levy-Prokhorov metric BP 
and it is well known that the periodogram converges weakly
as the sample size goes to infinity (see, e.g., [39]). Yet,
appropriate weakly continuous metrics that can be used to
quantify uncertainty have not received much attention:
the commonly used "total variation," Itakura-Saito, and other
distance measures are not weakly continuous. Besides the
relevance in uncertainty quantification and in filter design
(cf. Section VIII), computationally amenable and easy-to-use
metrics may provide a useful geometric setting for modeling
slowly time-varying processes and for integrating data from 
disparate sources (see, e.g., (28), ED, ED, ED, ED)- 

Appendix 

Proof: [Theorem \Tjj 
The canonical neighborhood basis for a point dv in the weak 
topology on 9Jt consists of sets of the type 



N(du,{g k }l =1 ,e) 



where g k are continuous functions on T for k = 0, . . . , n. To 
establish the theorem we prove that the neighbourhood basis 



K{dv)={N{du,{g u }l =0 ,e 
is equivalent to the basis 

^{dv) = \ J c „ : „, e 



>0,n6N,WL o cC(T)} 



e > 0,n £ N,c k = j z 

T 



^dv, k = 0, . . . , n 



First note that Vl(dv) D 3(dv), and hence the weak topology 
is at least as strong as the topology induced by $(dv). To 
establish the other direction, let N be an arbitrary set in ^(dv). 
To show the equivalence, it is enough to show that there exists 



n £ N such that T CQ . n>n -i C N. 



Let 8 be a weakly continuous metric for 9Jt and choose e so 
that the B s (dv,e) = {dfx > : 8(dfi,dv) < e} C N. Next, 
take dfii £ Fcg.^i-i with 



5(dfj,g,dv) > — sup{<5(d/!, dv) 



*e7- w -i} (17) 



for I > 1. Since T CQ . a -i C {dp : /x(T) < v(T) + 1), 
which is weakly compact, there is a convergent subsequence of 
d[ik that converges to dfi (by Banach-Alaoglu B31 ). Note that 

F CQ:t ,i-^ 3 closurclJco^+i.^+i)- 1 ) D .Fco^+i^+i)- 1 . hence 
dfi £ J-" Co , t £-i for any £. It then follows that dfi = dv since 
the trigonometric moment problem is determinate (by Riesz- 
Herglotz, see jT|). Let n be such that <5(<i/i n , dv) < e/2, then 
by ([T7li we have that F C(yn „-i C B s (dv,e) C N. We have 
thus shown that the topology induced by the neighbourhood 
basis $(dv) is the weak topology, and hence S is weakly 
continuous if and only if © holds. ■ 

Proof: [Proposition [7^ 
It is clear that condition (a) holds if and only if $\delta(d\mu_0, d\mu_1)$
is positive whenever $d\mu_0 \neq d\mu_1$. The triangle inequality and
symmetry always hold for such $\delta$, so we only need to show
that condition (b) holds if and only if $\delta$ is weakly continuous.

We will show by contradiction that condition (b) implies that $\delta$ is weakly
continuous. Assume therefore that condition
(b) holds, but that $\delta$ is not weakly continuous. Then there exists
$d\mu_k \to d\mu$ weakly such that $\delta(d\mu_k, d\mu) > \epsilon$, $k = 1, 2, \ldots$,
and hence there exist $g_{\xi_k}$, $\xi_k \in K$, such that

\[ \epsilon < \left| \int_{\mathbb{T}} g_{\xi_k}\,(d\mu_k - d\mu) \right|, \quad k = 1, 2, \ldots \]



To this end we use the Arzela-Ascoli theorem (see e.g., 
page 102]) which states that a set of functions is relatively 
compacQ in C(T) if and only if the set of functions is 
uniformly bounded and equicontinuous. Therefore, since (b) 
holds, the set {g^}^ & K is relatively compact in C(T), and 
there is a subsequence {gg^dug) of (g£ k ,d/ik) such that 
ge — > g £ C(T). A contradiction follows, since 



< / ge(dfjLi - d/i) 



> 



g k {dv - rf/j) 



< e, k = 0, 1, . . . ,n |, 



< ht - fflloo / Wi -dn\ + 

— > as I — > oo, 



g{dfj, e - dfj,) 



and hence $\delta$ is weakly continuous whenever condition (b) holds.

7 Since $K$ is formed out of two circles symmetrically located with respect to
the real axis of the complex plane, plots are identical for the two components
of $K$.

8 A set is relatively compact if its closure is compact.
Next, we show that (b) holds if $\delta$ is weakly continuous, and
once again we argue by contradiction. That is, we show that if (b)
fails to be true then $\delta$ is not weakly continuous. If $\{g_\xi\}_{\xi \in K}$
is not equicontinuous, then there exists an $\epsilon > 0$ such that for
any $k = 1, 2, \ldots$ one can find $\theta_k, \phi_k \in \mathbb{T}$, and $\xi_k \in K$, that
satisfy

\[ |\theta_k - \phi_k| < \frac{1}{k} \quad \text{and} \quad
   |g_{\xi_k}(\theta_k) - g_{\xi_k}(\phi_k)| > \epsilon. \tag{18} \]



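Let $(\theta_\ell, \phi_\ell)$ be a subsequence of $(\theta_k, \phi_k)$ such that $\theta_\ell \to \theta_0 \in \mathbb{T}$
as $\ell \to \infty$, and let $d\mu_\ell$ and $d\nu_\ell$ be the measures that consist
of a unit mass in $\theta_\ell$ and $\phi_\ell$, respectively. From (18) it follows
that $\phi_\ell \to \theta_0$, and hence that $d\mu_\ell \to d\mu_0$ and $d\nu_\ell \to d\mu_0$
weakly, where $d\mu_0$ is the measure that consists of a unit mass
in $\theta_0$. From (18) it follows that

\[ \delta(d\mu_\ell, d\mu_0) + \delta(d\nu_\ell, d\mu_0) \ge \delta(d\mu_\ell, d\nu_\ell)
   \ge |g_{\xi_\ell}(\theta_\ell) - g_{\xi_\ell}(\phi_\ell)| > \epsilon. \]

From this, it is evident that $\delta$ is not weakly continuous since
$\delta(d\mu_\ell, d\mu_0)$ and $\delta(d\nu_\ell, d\mu_0)$ cannot both converge to 0.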

Similarly, if $\{g_\xi\}_{\xi \in K}$ is not uniformly bounded, then for
any $k = 1, 2, \ldots$ one can find $\theta_k \in \mathbb{T}$ and $\xi_k \in K$ such that

\[ |g_{\xi_k}(\theta_k)| > k. \tag{19} \]

Let $d\mu_k$ be the measures that consist of a unit mass in
$\theta_k$. Therefore, the metric $\delta$ is not weakly continuous since
$\frac{1}{k} d\mu_k \to 0$ weakly, while $\delta(\frac{1}{k} d\mu_k, 0) > 1$ for all $k$. ■

Proof: [Proposition 1701/ 
(a) =>■ (fo) /ifc — !> weakly is equivalent to J_ f(t)dp k (t) — > 
Jl^ /W^MO f° r a U periodic continuous functions /(f). For 
all z = re l9 £ D, P r (0 — t) is periodic and continuous, hence 



u k{z) = — 

Z7T 



p r (e-t)dfx k (t) 

— t)dfx(t) = u(z). 



(b) => (c). For r < 1, K(re lS )| < ±±^ fc |(T). Since 
u k( r e te ) — )• u(re ) pointwise for all 9, it follows from 



bounded convergence that \l \u k {re l °) - u(re l0 )\d9 -> 0. 
Further more, Vfc, r, /J K(re ie )-u(re lf, )|d<9 < 27r(|/i fe |(T)+ 
|/z|(T)) which is uniformly bounded, hence 



\u k (re lt> ) -u(re lli )\d9rdr 

by dominated convergence. 

(c) =>• (d). Let A' C D be a compact set. Then there exist 
an e > such that B e (z ) = {z : \z - z \ < e} C D for 
all zo £ X. Now by the mean value property of harmonic 
functions we have 

2 r 

u ( z o) = -J / u(z )rdr 
e Jo 

= -5 / / + re l9 )d9rdr 

u(z)dxdy. 

Of course the same equality holds for itfc(zo) 
Ufc(2o) = — 7 / u k (z)dxdy. 

m JB e (z ) 



B e (z ) 



For any zo £ A' the difference between the harmonic functions 
is bounded by 

|u fc (z ) - u(z )| < ""T / \uk(z) - u(z)\dxdy 
ne Jb c (z q ) 

< — 7T / \uk{z) - u(z)\dxdy. 
Tre Jb 

By (c) the difference goes to zero uniformly in K. 

(d) => (a). Let / £ C(T). For any bounded measure v G T 
and corresponding harmonic function v(z) = P[v](z) Fubini's 
theorem gives 

/7T /*7T -i i>7T 

f(t)v(re u )dt = J —J P r (9-t)f(t)dtdv(9) 

P[f(t)dt](re ie )dv(9). 



Since / is periodic and continuous, P[f (t)dt](re %e ) converges 
uniformly to f(9), hence 



f(t)v(re lt )dt- / f(t)dv{t) 



< 



\\P[f](re^-f(t)\\ooW\(T) 

converges to zero independent of the measure v. This shows 
that for an arbitrary e > there exists an < r < 1 such that 



f(t)v(re lt )dt- / f(t)dv{t) 



< 



for v G {fJ,, fii, fi2, ■ ■ •}■ Further more, since Uk — > u 
uniformly on {z : \z\ < r}, it is possible to find an k re 
be such that 

f{t)u k {re lt )dt- ^ f(t)u(re u )dt < \ 
J -it 

for all k > k re . By the triangle inequality we have 



f(t)dn k {t) 



< e 



for all k > k r>e . Since e was chosen arbitrarily, 

W f(t)d(j,k(t) - fin f(t)dfj,(t) -> as k -> oo, and weak 
convergence follows. ■ 

Proof: [Proposition 12]
There exists an analytic function $f(z) = H[d\mu](z)$, $d\mu \in \mathcal{F}_{c_{0:n}}$,
such that $f(z) = w_z$ if and only if its associated Pick
matrix is nonnegative [34], i.e.,



9T 

iJ n 

w z b* z — d* z 



b z w z - 

W z -\-W z 



> 0. 



(20) 



1 — zz 

By using Schur's lemma and completing the square, we arrive 
at 

2 



< 



1 — zz 



+ (d z ,b z ) T -i 



{b z ,b z ) T -i 
+ (b z ,d z ) T -i 



(b z ,b z ) 



T -i 



(d z ,d z )r 
(b z ,b z ) 



(21) 



T -i 



where equality holds if and only if the Pick matrix (20) is
singular. From this, the first part of Proposition 12 follows.
Since the maximum is obtained when equality holds in (21),
the associated Pick matrices are singular. Hence the solutions
are unique and correspond to measures with support on n + 1
points Q3] Proposition 2]. ■ 

Background on orthogonal polynomials and Schur coefficients 

Let c be a nonnegative covariance sequence with corre- 
sponding measure dp, and consider the inner product 

1 



(a(z), &(*)> = — / a(e*)b(e«>)dn(6). 
Ztt J t 

The so-called orthogonal polynomials (of the first kind) 
4>k{z) l22l are (uniquely defined) monic polynomials with 
dcg <t>k {z) = k, k = 0, 1 . . ., which are orthogonal with respect 
to (■,■}. They are shown l22l to satisfy the recursion 



(j>k+i(z) = z(f> k (z) - 7fe0 fc (z)* , 
(f>k+i{z)* = <j>k{z)* - zj k (j> k (z), 



(22) 



where <f>k{z)* = z k <f)k{z~ 1 ) and {~/k}kLi are tne so-called 
Schur parameters. 

The orthogonal polynomials of the second kind are defined 

1 

Mz) = -l(f(z-i))Mz)}+, 

Co 

where [•] + denote "the polynomial part of". They are also "or- 
thogonal polynomials" but with respect to a certain "inverted" 
covariance (corresponding to the negative of the original Schur 
parameters, cf. J22|) and satisfy the recursion 



V>)t+i(z) = zipk(z) + 7fcV'fc(z)* ! 



(23) 



The positive-real function f(z) = H[dp](z) may be ex- 
pressed using the orthogonal polynomials as 



f(z) 



CO 



ipkjz)* + zs k +i{z)ipk{z) 
(f>k(z)* - zs k+ i(z)(j) k (z) ' 



(24) 



where s k +i(z) belong to the Schur class S, i.e. the class of 
analytic functions on E> uniformly bounded by 1. Equations 
( I2212H lead to 



1 + zsi(z) 



f{z) = Co i- ZSl (zy Sk{z)= i 



Ik + zs k+1 (z) 



(25) 



For n — 0, we have <fio( z ) — ipo( z ) — 1- The expression in 
is 



1 



asi 1 

= C : 



2c a 



-a + Si 



co- 

l — asi 1 — 

hence tq = 2coa/(l — |a| 2 ), where s k without argument 
denotes s k (a). Next, consider the radius of d26l for n = k — 1. 
The set ( |26l ) is the range of a Mobius transform applied to 
Sfe G D, and may be represented as 

Vk-i + s k 



M k -l 



I"k — 1 ... 

5 Vfc_i- 



(27) 



1 + Ufc_iSfc 

where M k -i and rfe_i are the center and radius of the disc, 
respectively, and where k -i G (— 7r,7r], Vfc-i G D. From the 
recursion ( l25b . is can be seen that 



where 



1 + Vk-iSk l + 7fcw fe -i l + ?; fc asfc+i' 
Vk-l +lk 

Vk = 



1 + JkVk-l 

The set ( l26b for n = k is therefore 

1 + 7feVfc_i i] k + as k+ i 



k-l 



'Tk-V 



Sfe+1 G 



1 + 7fcW fc _i 1 + ?7fcasfc + i 
A Möbius transformation $s \mapsto (a + bs)/(c + ds)$, with $|c| > |d|$,
maps the unit disc to a disc of radius $|ad - bc|/(|c|^2 - |d|^2)$.
Therefore, the radius 



r k 



r k -i a 



1-1% I 



1~M 2 M 2 
is maximized when rj k = 



r k -i\a\ 1 



|%| 2 (1-H 2 ) 



l-M 2 M a 
0, or equivalently when 7^ 



—Vk-i. Hence, r k < rfe_i|a| with equality if 7^ = -V k -i- 
By induction, the maximal radius is given by

\[ r_n = r_0 |a|^n = \frac{2 c_0 |a|^{n+1}}{1 - |a|^2}. \]

Furthermore, jf. = —Vk-l m the recursion (l25l correspond 
to the Schur parameters 71 = a, and j k — for k = 
2, . . . , n. This leads to the covariance sequence Co :rl = 
(co, coa, . . . , coa n ). Since Sk is defined as the maximal 
diameter over all a G K, the inequality 

4|a|™ +1 c 
Ps K (-^c :„ ) = max p Sa ( J 7 Co . n ) < max ■ 



+ 27fcS fc+ i(z) : 

for fc = 1,2,.... For a complete exposition on orthogonal 
polynomials and Schur's algorithm see JTJ, 11221 . Il25l . 
Proof: [Theorems [73] and \18V 
Following our earlier notation, let 

P8 a {Fc (hn ) = max.{\P[diM}](a)-P[dm](a)\ : dfi ,d^i G F Co . n } ft z \ 

be the uncertainty diameter at the point a G K. By using (l24l 
and noting that 

P[d(j](a) = meH[dn\(a) = <Ref(a), 

the diameter P5 a (F CQ n ) is equal to the diameter of the disc 

^n{a)* + as n+ i{a)ip n (a) 



a£K 



aeK 1 



of 



holds and is achieved for co :n = (co, coa, . . . , coa n ) where 
a G K maximizes |a| n+1 /(l — |a| 2 ). 

In the Nevanlinna-Pick case, the recursion is identical to (25)
except that $z$ is replaced by the inner factor
$\xi_k(z) = (z_k - z)/(1 - \bar{z}_k z)$:

1 

w 



zsi(z) 7 fe + £ k (z)s k +i(z) 

Sk{z) ~ 



co- 



s„+i(a) G D, 



(26) 



1-Z3i(z)' l+Jk€k(z)Sk+l(z)' 

for k = 1,2, ...,n lfT3l . The argument here is analogous 
to the covariance case. The shrinkage of the radius is
$r_k \le r_{k-1} |\xi_k(a)|$, with equality when the parameters in the
recursion are $(a, 0, \ldots, 0)$, as in the covariance case. The bound of
the uncertainty diameter at $a$ then becomes

^ \a\U n z=1 Ma)\ _ Aw Q \B z (a)\ 



</>«(«)* - as n+1 (a)(t> n (a) 

where <p n and ip n are specified via d22l . d23l . by the Schur se- 
quence (71, . . . ,7„) corresponding to Co : „ (see |25l ). Denote 
by r n the radius of (1261 . and hence ps a {^c a . n ) = 2r n . 



2r„ < 



(28) 



l-|af l-|a| 2 
which is attained when $w_k = w_0 (1 + z_k a)/(1 - z_k a)$ for
$k = 1, \ldots, n$. Since $\rho_{\delta_a}(\mathcal{F}_{z,w}) = 2 r_n$, maximizing (28) over
$a \in K$ gives the bound (16). ■
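The disc-mapping fact used in the proof above can be verified numerically. The sketch below (with the hypothetical helper name `mobius_disc`) computes the center and radius of the image of the unit disc under $s \mapsto (a + bs)/(c + ds)$ for $|c| > |d|$; the radius is $|ad - bc|/(|c|^2 - |d|^2)$ and the center, stated here as a standard fact rather than taken from the paper, is $(a\bar{c} - b\bar{d})/(|c|^2 - |d|^2)$.

```python
import numpy as np

def mobius_disc(a, b, c, d):
    """Center and radius of the image of the unit disc under
    s -> (a + b*s) / (c + d*s), assuming |c| > |d| (pole outside the disc)."""
    if abs(c) <= abs(d):
        raise ValueError("need |c| > |d| so the image is a bounded disc")
    den = abs(c) ** 2 - abs(d) ** 2
    center = (a * np.conj(c) - b * np.conj(d)) / den
    radius = abs(a * d - b * c) / den
    return center, radius
```

Applied to the map $(\eta + a s)/(1 + \bar{\eta} a s)$, the radius evaluates to $|a|(1 - |\eta|^2)/(1 - |a|^2|\eta|^2)$, which is at most $|a|$ and equals $|a|$ exactly when $\eta = 0$; this is the contraction factor behind $r_k \le r_{k-1}|a|$.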